#Introduction #

As of Monday, November 21st, 2022, 430 of the 435 House seats have been called, with 218 confirmed Republican seats and 213 Democratic seats. This means the Republicans did take control of the House, but predictions of a “Red Wave” did not hold up. The Democrats narrowly held on to the Senate, with 50 seats so far, and Georgia is headed to a runoff. If Warnock wins, as many predict, the final makeup of the Senate will be 51 Democrats and 49 Republicans.

#Recap of Model and Prediction #

My model used aggregate data from 2018 to predict the 2022 outcome; I discuss the challenges this created later on. I fit both GLM and LM specifications with the following inputs: national economic data (national 2018 GDP and RDI), local economic data (unemployment), weighted polling (2018 polls, with polls closer to the election given more weight), the average expert prediction (1 for Strong Dem to 7 for Strong Rep), turnout, and the incumbency status of the winning candidate.
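To make the poll-weighting idea concrete, here is a minimal sketch in R. The poll values and the `1 / (days_before + 1)` weighting rule are assumptions chosen for illustration; the weighting scheme actually used in the model may differ.

```r
# Hypothetical polls for one district: Democratic share and days before the election.
polls <- data.frame(
  dem_pct     = c(48, 51, 53),
  days_before = c(60, 20, 5)
)

# Assumed weighting rule: weight falls off as 1 / (days_before + 1),
# so polls taken closer to election day count for more.
polls$weight <- 1 / (polls$days_before + 1)

weighted_poll <- weighted.mean(polls$dem_pct, polls$weight)
simple_poll   <- mean(polls$dem_pct)

print(c(weighted = weighted_poll, simple = simple_poll))
```

Because the most recent poll in this toy data is also the most favorable to the Democrat, the weighted average ends up above the simple average.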

As you can see, my model had many inputs, and when I ran the regression, many of these variables had little impact on the results.

#Graphic Performance of My Model #

| Statistic | Value |
|---|---|
| Observations | 7416 |
| Dependent variable | DemVotesMajorPercent |
| Type | Linear regression |
| 𝛘²(5) | 1802572.07 |
| Pseudo-R² (Cragg-Uhler) | 0.74 |
| Pseudo-R² (McFadden) | 0.16 |
| AIC | 53899.09 |
| BIC | 53947.47 |

| Term | Est. | S.E. | t val. | p |
|---|---|---|---|---|
| (Intercept) | 91.61 | 1.09 | 83.79 | 0.00 |
| avg | -6.50 | 0.05 | -141.24 | 0.00 |
| Unemployed_prct | 0.36 | 0.20 | 1.86 | 0.06 |
| winner_candidate_incIncumbent | 3.47 | 0.25 | 13.87 | 0.00 |
| Receipts | -0.00 | 0.00 | -8.91 | 0.00 |
| turnout | -27.93 | 1.31 | -21.24 | 0.00 |

Standard errors: MLE
##                                          2.5 %            97.5 %
## (Intercept)                    89.468288488851  93.7541999508942
## avg                            -6.590048492167  -6.4096550853631
## Unemployed_prct                -0.020256643106   0.7464684345044
## winner_candidate_incIncumbent   2.977119614558   3.9571145049945
## Receipts                       -0.000001467228  -0.0000009379973
## turnout                       -30.504819396926 -25.3510452613692
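
An interval that spans zero, like the one for Unemployed_prct above, means the coefficient cannot be distinguished from zero at the 5% level, which matches its p-value of 0.06. The sketch below shows one way to produce and read such intervals. It runs on simulated data, not the real dataset: the column names mirror the output above, but the values, the reduced formula, and the use of Wald intervals via `confint.default()` are all assumptions.

```r
set.seed(1)
n <- 200

# Simulated stand-in data; every value here is made up for illustration.
sim <- data.frame(
  avg             = sample(1:7, n, replace = TRUE),
  Unemployed_prct = runif(n, 2, 8),
  turnout         = runif(n, 0.3, 0.7)
)
# In this simulation only `avg` truly affects the outcome.
sim$DemVotesMajorPercent <- 90 - 6.5 * sim$avg + rnorm(n, sd = 5)

# Gaussian GLM, which is equivalent to an ordinary linear regression.
fit <- glm(DemVotesMajorPercent ~ avg + Unemployed_prct + turnout,
           family = gaussian(), data = sim)

# Wald 95% intervals: estimate +/- 1.96 * standard error.
ci <- confint.default(fit, level = 0.95)

# Flag coefficients whose interval includes zero.
covers_zero <- ci[, 1] < 0 & ci[, 2] > 0
print(round(cbind(ci, covers_zero), 3))
```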
## Warning in final_test$`Total Vote`[final_test$`Total Vote` == 0] <-
## (final_test$Democratic + : number of items to replace is not a multiple of
## replacement length
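
The warning above usually means a subset on the left-hand side is being assigned a full-length vector on the right, so R fills the selected rows with values taken from the wrong rows. The sketch below reproduces the pattern on a toy data frame and shows one possible fix. The data frame `toy` and its columns are hypothetical; the full expression from the original code is truncated in the warning, so this is not a reconstruction of it.

```r
# Toy stand-in for final_test; the column names and values are assumptions.
toy <- data.frame(
  Democratic = c(100, 200, 300, 400, 500),
  Republican = c(150, 250, 350, 450, 550),
  total      = c(250,   0, 650,   0,   0)
)

missing_total <- toy$total == 0

# Buggy pattern: three slots on the left, five values on the right.
# R warns and fills the slots with the FIRST three sums (250, 450, 650),
# which belong to rows 1-3 rather than rows 2, 4 and 5.
# toy$total[missing_total] <- toy$Democratic + toy$Republican

# Fix: subset the right-hand side with the same logical mask,
# so each row receives its own sum.
toy$total[missing_total] <- (toy$Democratic + toy$Republican)[missing_total]

print(toy)
```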

Below were my confidence intervals for my prediction

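# Packages used below; final_22 is assumed to be an sf object of 2022 districts.
library(ggplot2)
library(plotly)
library(sf)

# Simplify district geometries so the interactive map renders quickly.
final_22 <- rmapshaper::ms_simplify(final_22, keep = 0.01)

# Map of margin22 (actual minus predicted Democratic vote share) by district.
margin_plot <- ggplot() +
  geom_sf(data = final_22, aes(fill = margin22), color = "grey60", size = 0.05) +
  scale_fill_gradient2(low = "red",
                       mid = "white",
                       high = "blue",
                       midpoint = 0,
                       name = "Dem Vote Margin") +
  # Crop to the contiguous United States.
  coord_sf(xlim = c(-124.43, -66.57), ylim = c(23, 52), expand = FALSE) +
  theme(axis.title.x = element_blank(),
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.title.y = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank(),
        plot.title = element_text(margin = margin(0, 0, 10, 0), hjust = 0.5)) +
  labs(fill = "DemMajorVotePct Margin Difference",
       title = "Model's Margin Difference per District (actual - pred)")

# Convert the static map to an interactive plotly widget.
ggplotly(margin_plot)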

Above, I have plotted the margin (actual minus my model’s prediction) of the Democratic vote percentage in 2022. My model greatly overestimated Democratic performance, even in blue states. There are some interesting exceptions: it underpredicted Democratic performance in states like South Carolina, Montana, and parts of Florida. I think this is an interesting finding when considered in conjunction with speculation among political scientists and party leadership about increased Democratic mobilization in certain Republican stronghold states.
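
For readers who want to quantify this kind of miss rather than read it off a map, here is a small sketch of the same actual-minus-predicted calculation. The districts and vote shares are invented for illustration and are not taken from the model's output.

```r
# Hypothetical actual and predicted Democratic two-party vote shares (percent).
results <- data.frame(
  district  = c("A", "B", "C", "D"),
  actual    = c(45, 52, 60, 38),
  predicted = c(50, 58, 63, 35)
)

# Same sign convention as the map: actual minus predicted.
results$margin22 <- results$actual - results$predicted

# Negative mean error => Democrats were overestimated on average.
mean_error <- mean(results$margin22)

# Root mean squared error: the typical size of a miss, ignoring direction.
rmse <- sqrt(mean(results$margin22^2))

# Districts where the model underpredicted Democrats (positive margin).
under_predicted <- results$district[results$margin22 > 0]

print(list(mean_error = mean_error, rmse = rmse, under_predicted = under_predicted))
```

A negative mean error alongside a handful of positive margins is the same pattern described above: broad overestimation of Democrats with a few underpredicted areas.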

#Where I Went Wrong #

I think the biggest mistake I made was not using a pooled model. I think most of the reason my model predicted such strong results for Democrats is that I trained it only on data from 2018, a year in which Democrats performed well nationally.
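
The sketch below illustrates why training on a single favorable year can inflate predictions, assuming "pooled" here means stacking several election years into one training set. Everything in it is simulated: the helper `make_year()`, the swing values, and the one-predictor formula are hypothetical and are not the model described above.

```r
set.seed(42)

# Simulate one cycle of district-level data. `national_swing` shifts every
# district in that year, standing in for a wave election (made-up values).
make_year <- function(year, national_swing, n = 100) {
  avg <- sample(1:7, n, replace = TRUE)
  data.frame(
    year = year,
    avg  = avg,
    DemVotesMajorPercent = 76 - 6.5 * avg + national_swing + rnorm(n, sd = 3)
  )
}

# Stack three cycles; only 2018 is given a pro-Democratic swing.
pooled <- rbind(
  make_year(2014, -3),
  make_year(2016, -1),
  make_year(2018,  4)
)

# Model trained on 2018 alone versus on all years together.
single_fit <- lm(DemVotesMajorPercent ~ avg, data = subset(pooled, year == 2018))
pooled_fit <- lm(DemVotesMajorPercent ~ avg, data = pooled)

# Predict a toss-up district (expert rating of 4) with each model.
new_district <- data.frame(avg = 4)
pred_single <- predict(single_fit, new_district)
pred_pooled <- predict(pooled_fit, new_district)

print(c(single_2018 = unname(pred_single), pooled = unname(pred_pooled)))
```

In this simulation the 2018-only model absorbs that year's swing into its intercept and predicts a higher Democratic share for the same toss-up district than the pooled model does.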

#If I Were to Do It Again …#